An Efficient Voice Activity Detection Method using Bi-Level HMM
Similar resources
Applying the Bi-level HMM for Robust Voice-activity Detection
This paper presents a voice-activity detection (VAD) method for sound sequences with various SNRs. For real-time VAD applications, it is inadequate to employ post-processing to remove burst clippings from the VAD output decision. To tackle this problem, building on the bi-level hidden Markov model, in which a state layer is inserted into a typical hidden Markov model (HMM), we formul...
Robust visual speakingness detection using bi-level HMM
Visual voice activity detection (V-VAD) plays an important role in both HCI and HRI, affecting both the conversation strategy and sync between humans and robots/computers. The typical speakingness decision of V-VAD consists of post-processing for signal smoothing and classification using thresholding. Several parameters, ensuring a good trade-off between hit rate and false alarm, are usually he...
A hybrid HMM/TRAPS model for robust voice activity detection
We present three voice activity detection (VAD) algorithms that are suitable for the off-line processing of noisy speech and compare their performance on SPINE-2 evaluation data using speech recognition error rate as the quality metric. One VAD system is a simple HMM-based segmenter that uses normalized log-energy and a degree of voicing measure as raw features. The other two VAD systems focus ...
Efficient voice activity detection algorithms using long-term speech information
Currently, there are technology barriers that prevent speech processing systems from working under extremely noisy conditions. The emerging applications of speech technology, especially in the fields of wireless communications, digital hearing aids or speech recognition, are examples of such systems and often require a noise reduction technique operating in combination with a precise voice activity dete...
Duration-embedded bi-HMM for expressive voice conversion
This paper presents a duration-embedded Bi-HMM framework for expressive voice conversion. First, Ward’s minimum variance clustering method is used to cluster all the conversion units (sub-syllables) in order to reduce the number of conversion models as well as the size of the required training database. The duration-embedded Bi-HMM trained with the EM algorithm is built for each sub-syllable cl...
Journal
Journal title: The Journal of the Korea Institute of Electronic Communication Sciences
Year: 2015
ISSN: 1975-8170
DOI: 10.13067/jkiecs.2015.10.8.901